(1) Foreword

This  progress  report  summarizes  two  types  of  work.  In  the  first  part  of  our  study, 
we  performed  experimental  analysis  of  the  underlying  cortical  circuitry  required  to 
analyze  objects  moving  across  the  skin.  In  the  second  part  of  the  study,  we  applied  the 
metrics  that  we  obtained  to  a  biologically  faithful  model  and  showed  that  this  model  was 
extremely  efficient  at  detecting  objects  within  natural  cluttered  scenes. 

(4) Statement of the problem studied.

The  specific  aim  of  this  project  was  to  test  the  hypothesis  (through  parallel  neurophysiological 
experiments  and  neural  network  modeling  simulations)  that  perceptual  mislocalization  of  a  moving  tactile 
stimulus  arises  from  a  systematic  misrepresentation  of  stimulus  location  on  the  skin  by  primary 
somatosensory  cerebral  cortex  (SI).  Experimentally,  SI  cortical  experiments  substantiated  the  original 
hypothesis  by  demonstrating  that  the  pattern  of  neural  activity  evoked  in  SI  cortex  by  a  moving  skin  stimulus 
varies  with  stimulus  velocity  in  a  manner  paralleling  that  of  perception.  In  the  modeling  studies,  a  novel 
model  of  synaptic  input  integration  by  dendrites  of  cortical  pyramidal  cells  was  developed  which  enables 
cells  to  tune  to  higher-order  stimulus  features.  Studies  with  the  model  also  supported  the  original 
hypothesis. Additionally, this network model was tested for its ability to extract higher-order features of
sensory  input  patterns,  and  it  was  shown  to  be  very  successful  at  extending  current  techniques  of  nonlinear 
factor  analysis.  In  this  progress  report,  we  demonstrate  its  use  in  automatic  target  recognition,  or  more 
specifically,  in  recognizing  military  vehicles  in  real-world  settings. 

(5) Summary of the most important results.

I.  Biological  findings. 

An  object’s  motion  has  prominent  and  diverse  effects  on  its  perceived  position.  For  example,  at 
high  velocities  of  skin  brushing  stimulation,  both  the  first  and  the  last  skin  points  contacted  by  the 
brush  are  perceived  to  be  shifted  in  the  direction  of  brush  motion,  and  the  skin  path  taken  by  the 
brushing  stimulus  is  perceived  to  be  much  shorter  than  it  really  is  (Whitsel  et  al.,  1986).  In  vision, 
such  effects  of  stimulus  velocity  on  the  perceived  positions  of  the  start  and  end  points  are  known 
as the Fröhlich and flash-lag effects, respectively (Whitney, 2002).

To explore the neural bases of these prominent perceptual phenomena, the response of the primary
somatosensory cortex (SI) to skin-brushing stimulation was studied in monkeys using
near-infrared optical intrinsic signal (OIS) imaging of stimulus-evoked SI activity and
extracellular recording of the spike discharge activity of SI neurons. The OIS findings clearly show
that the spatial distribution of the optical response in SI is velocity- and direction-dependent: (1)
the region of SI activation is much smaller at 100-200 cm/sec than at 10-50 cm/sec; and (2) at
higher  stimulus  velocities  the  optical  response  shifts  its  location  in  SI  in  the  direction  of  stimulus 
motion. Figure 1 below compares the optical responses to flutter at two skin locations with the
response to a stimulus moving between those two points. When the moving stimulus travels
between  two  points  on  the  skin  at  a  relatively  low  velocity  (5  cm/sec  in  this  example),  it  activates 
a  fairly  large  region  of  SI  cortex.  When  the  same  stimulus  is  sped  up  (to  200  cm/sec),  activation 
is  observed  most  prominently  in  the  cortical  region  that  corresponds  with  the  skin  region  that  the 
stimulus  is  moving  towards  (thus,  this  figure  demonstrates  that  the  response  of  SI  cortex  is  both 
velocity  and  direction  dependent). 


Figure 1. Imaged response of moving stimulus. Panels: flutter @ #1; flutter @ #2; moving from #1 to #2, 5 cm/sec; moving from #1 to #2, 200 cm/sec. Sites #1 and #2 are marked (X) on the hindlimb.

Cutaneous mechanoreceptor afferents were found to respond to low- and high-velocity skin-brushing
stimulation with spike discharges that occurred at virtually the same brush positions,
indicating that the velocity-dependent shift of the OIS response in SI must have a central origin.
Among SI neurons, 50% of the neurons in areas 3b and 1 (Group I) showed, at high brushing velocities,
a significant displacement of stimulus-evoked firing to later positions along the stimulus
path; this shift, however, is attributable simply to the conduction delay (on average 20 msec)
between the skin and SI cortex. The other 50% of neurons in areas 3b and 1 (Group II) were
more interesting: although the ON-edge of their response profile also shifts with stimulus velocity
because of the 20 msec latency, the OFF-edge of their response profile is mostly velocity-invariant; or,
more precisely, it follows a shallow U-shaped course as velocity is increased from 1 to
250 cm/sec, remarkably similar to the behavior of the perceived locus of the final position of the
brushing stimulus in human tactile psychophysical studies. In another close parallel to
psychophysics, the length of the response profile of Group II neurons shows the same
dependency on brushing velocity as does the perceived length of skin brushed by a stimulus.
Both curves are remarkably similar: they decline at velocities from 1 to 5 cm/sec, plateau
between 5 and 30 cm/sec, and decline again at velocities above 30 cm/sec.
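
The purely latency-driven part of this shift can be made concrete with a short calculation. The sketch below assumes that the brush moves at constant velocity and that the skin-to-SI delay is a fixed 20 msec (the average quoted above); the function name latency_shift_cm is ours and is used for illustration only. Under these assumptions the displacement is simply velocity multiplied by delay.

```python
def latency_shift_cm(velocity_cm_per_s, latency_ms=20.0):
    """Distance (cm) the brush travels during the skin-to-cortex delay.

    ASSUMPTIONS: constant brush velocity and a fixed delay of 20 msec,
    the average value quoted in the report.
    """
    return velocity_cm_per_s * latency_ms / 1000.0


# Velocities spanning the 1-250 cm/sec range discussed in the text.
for v in (1, 5, 30, 100, 250):
    print(f"{v:>4} cm/sec -> shift of {latency_shift_cm(v):.2f} cm")
```

At 1 cm/sec the expected displacement is 0.02 cm, whereas at 250 cm/sec it grows to 5 cm, which is why a latency-driven shift becomes noticeable only at high brushing velocities.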

In  conclusion,  the  similarity  of  effects  of  stimulus  velocity  on  the  response  of  Group  II  neurons  in 
SI  and  on  perceived  stimulus  position  in  human  psychophysics  suggests  that  the  perceptual 
distortions  of  a  brushing  stimulus  position  on  the  skin  have  their  neural  counterparts  in  the 
distortions  of  the  SI  representation  of  the  position  of  such  a  stimulus.  At  the  same  time,  the 
existence  of  Group  I  neurons  in  SI  suggests  that  these  neural  correlates  of  psychophysics  are  not 
a  universal  property  of  all  SI  neurons. 


II.  Computational  Findings 

As  a  part  of  this  ARO-funded  research  project  (P-43077-LS;  Tommerdahl,  P.I.),  we  have 
developed  a  computational  ‘SINBAD’  model  of  how  cerebral  cortical  neurons  learn  to  recognize 
higher-order  features  in  their  sensory  environments  (Ryder  and  Favorov,  2001;  Favorov  and 
Ryder,  2004;  Favorov  et  al.,  2003).  This  work  led  us  to  formulate  a  novel  computational 
‘SINBAD’  algorithm  that  significantly  extends  current  techniques  of  nonlinear  factor  analysis 
(Kursun  and  Favorov,  2004a).  To  demonstrate  the  analytical  powers  of  this  algorithm,  we  have 
successfully  applied  it  to  computer-science  problems  of  super-resolution  and  human  face 
recognition  (Kursun  and  Favorov,  2003,  2004b).  We  are  also  currently  applying  a  version  of  this 
algorithm  (called  ‘Virtual  Scientist’;  Kursun  and  Favorov,  2004a)  to  metabolomics,  a  field  in 
functional genomics. Another challenging but potentially valuable practical application of the
SINBAD algorithm, suggested by Dr. Schmeisser, is automatic target recognition, for
example recognizing military vehicles in real-world settings. This report describes our
initial progress on such a target-recognition task.


Approach 

The  study  was  carried  out  on  the  TNO-TM  Search_2  dataset  of  44  high-resolution 
photographs  of  cluttered  rural  scenes  containing  9  types  of  military  vehicles  (Toet  et  al.,  1998). 
24  of  those  images  were  used  to  train  our  procedures.  10  other  randomly  chosen  images  were 
used  to  test  their  performance  after  training.  Our  approach  is  based  on  a  hierarchical  iterative 
procedure  that  involves:  (1)  learning  SINBAD  features  characteristic  of  the  kinds  of  patterns 
encountered  in  the  database  images  and  using  these  features  to  identify  possible  target 
locations,  (2)  developing  additional  SINBAD  features  specifically  of  such  ‘suspicious’  locations 
and using these specialized features to narrow down the set of possible target locations, (3)
developing  another  set  of  even  more  specialized  SINBAD  features  on  this  narrowed-down  set  of 
locations  and  using  them  to  further  reduce  the  number  of  false  detections,  and  so  on. 

SINBAD features are learned by a network of SINBAD cells. Each cell implements a
machine-learning  algorithm  designed  to  extract  mutual  information  from  different,  but  related  sets 
of  inputs  (Favorov  and  Ryder,  2004;  Kursun  and  Favorov,  2004a).  Applying  this  algorithm  to  our 
target  recognition  problem,  we  use  it  to  learn  various  characteristic  types  of  local  structures 
present  in  natural  landscape  images.  Such  local  structures  are  distinguished  by  certain  degrees 
of  internal  redundancy  and  this  redundancy  is  used  by  SINBAD  neurons  both  to  discover  such 
structures  and  to  learn  to  distinguish  among  their  different  categories. 
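
The notion of redundancy between related input sets can be illustrated with a toy calculation of mutual information. The sketch below is not the SINBAD learning rule itself; it only computes the quantity that a SINBAD cell is designed to exploit, for two hypothetical discretized input sets (for example, binarized summaries of the left and right halves of a small image window). The function name and the toy data are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long sequences
    of discrete symbols, estimated from their joint frequencies."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# HYPOTHETICAL data: binarized summaries of two halves of an image window.
left_half = [0, 0, 1, 1, 0, 1, 0, 1]
structured_right = [0, 0, 1, 1, 0, 1, 0, 1]  # repeats the left half (fully redundant)
unrelated_right = [0, 1, 0, 1, 0, 0, 1, 1]   # statistically unrelated to the left half

print(mutual_information(left_half, structured_right))
print(mutual_information(left_half, unrelated_right))
```

In this toy case the redundant pair shares one full bit of information and the unrelated pair shares none; a local structure in an image would fall somewhere between these extremes.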

To make a decision on whether a given image location contains a target (a vehicle), we
use a Support Vector Machine (SVM; Schölkopf and Smola, 2002). SVMs have an important
advantage over other types of classifiers in that they can be successfully trained even on very
limited numbers of training samples (and even when input vectors are very high-dimensional,
which is our case), avoid over-fitting, have excellent generalizing abilities, and converge very
quickly. The essential features of our approach are illustrated in Figure 2. The first SVM (SVM1)
is trained to respond positively when its small viewing window is placed over a military vehicle in
any given training image. As can be expected, this SVM fails to learn this
classification task perfectly: in order to avoid missing any vehicles, it incorrectly responds to views
of nature as if they contained a vehicle on 0.35% of trials. We use SVM1 for an initial search of a
given full-size image: we scan the SVM viewing window over the entire image and select for
further analysis those locations where SVM1 responded positively. By doing this, we quickly
discard 99.65% of locations in the image as of no interest. However, we are still left with a large
number of locations that might contain a target.
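
The initial scan described above can be sketched as a sliding-window search. This is a minimal illustration, not our implementation: the classifier bright_patch is a simple brightness threshold standing in for a trained SVM1, and the image, the window size, and all function names are hypothetical.

```python
def scan_image(image, window, classify, step=1):
    """Slide a square window over a 2-D image (a list of rows) and return
    the top-left corners of all windows that `classify` flags as positive."""
    rows, cols = len(image), len(image[0])
    hits = []
    for r in range(0, rows - window + 1, step):
        for c in range(0, cols - window + 1, step):
            patch = [row[c:c + window] for row in image[r:r + window]]
            if classify(patch):
                hits.append((r, c))
    return hits


def bright_patch(patch):
    """STAND-IN for a trained first-stage classifier (not an SVM):
    positive when the mean intensity of the patch exceeds 0.5."""
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0])) > 0.5


# Toy 8x8 image containing one bright 3x3 'object' on a dark background.
image = [[0.0] * 8 for _ in range(8)]
for r in range(2, 5):
    for c in range(3, 6):
        image[r][c] = 1.0

hits = scan_image(image, window=3, classify=bright_patch)
n_locations = (8 - 3 + 1) ** 2
print(f"{len(hits)} of {n_locations} window positions kept for further analysis")
```

Only the positions returned in hits would be passed on to the later stages; all other window positions are discarded after this single inexpensive pass.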

For the second stage, we develop SINBAD features using only those image locations that
were identified as ‘suspicious’ by SVM1. For this report, we used a network of 14 SINBAD cells
(‘SINBAD Network 1’ in Figure 2). Each cell learns a different feature within the same 5x5 pixel
viewing window. Together, the outputs of these SINBAD cells represent the image content of a
5x5 pixel window by a vector in a 14-dimensional ‘feature’ space. SINBAD features, in turn, are
used as inputs to the second SVM (SVM2 in Figure 2). SVM2 is trained to recognize the presence
of a vehicle within a 20x20 pixel window. During training, the window is placed only at those
image locations that were marked as ‘suspicious’ by SVM1. SVM2 greatly reduces the number of
False Positives made by SVM1 (currently by a factor of 20) without missing any of the
real vehicles in the test images. Thus, the sequence of SVM1 and SINBAD-SVM2 in our
experiments so far was able to detect all the test vehicles while making False Positive mistakes
on only 0.015% of the test trials.
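
The two-stage False Positive rate follows from the figures above by simple arithmetic. The sketch below uses only the rates quoted in the text, plus one clearly hypothetical number: the count of window positions per image, which is an assumption chosen to give a sense of scale and is not a figure from this study.

```python
# Figures quoted in the text.
stage1_fp_rate = 0.35 / 100   # fraction of non-target windows flagged by the first SVM
stage2_reduction = 20         # approximate factor by which the second SVM cuts False Positives

combined_fp_rate = stage1_fp_rate / stage2_reduction


def expected_false_positives(n_windows, fp_rate):
    """Expected number of non-target windows flagged at a given per-window rate."""
    return n_windows * fp_rate


# ASSUMPTION: one million window positions per image (not a figure from the report).
n_windows = 1_000_000
after_stage1 = expected_false_positives(n_windows, stage1_fp_rate)
after_stage2 = expected_false_positives(n_windows, combined_fp_rate)

print(f"combined rate: {combined_fp_rate * 100:.4f}%")
print(f"expected False Positives: {after_stage1:.0f} after stage 1, "
      f"{after_stage2:.0f} after stage 2")
```

Dividing 0.35% by 20 gives roughly 0.0175%, which is of the same order as the two-stage rate reported above; the small difference is consistent with the factor of 20 being a rounded value.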

These  False  Positives  can  be  reduced  further  by  one  or  more  additional  SINBAD-SVM 
stages  (one  such  stage  is  shown  in  Figure  2).  At  each  stage,  SINBAD  features  can  be  developed 
specifically  for  those  image  locations  that  were  considered  suspicious  by  the  preceding  stages  of 
the  analysis.  Such  specialized  SINBAD  features  will  exhibit  progressively  greater  discriminative 
sensitivity  to  image  details  specific  to  the  ‘suspected’  (i.e.,  containing  a  vehicle  or  not  yet  ruled 
out)  image  locations.  The  enhanced  sensitivity,  in  turn,  can  enable  the  next  SVM  to  improve  its 
classification  performance,  further  reducing  the  numbers  of  False  Positives. 
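
The idea of fitting each stage only on the locations that survived the preceding stages can be sketched as follows. This is a deliberately simplified stand-in: each 'stage' here is a single threshold on one hypothetical feature score, not a SINBAD network followed by an SVM, and the sample data and all names are invented for illustration. What the sketch preserves is the structure of the procedure: every stage is chosen to keep all labeled targets while rejecting as many of the still-surviving non-targets as possible.

```python
def fit_stage(samples, features):
    """Choose the feature and threshold that keep every labeled target while
    passing the fewest non-targets among the samples given."""
    best = None
    for name in features:
        threshold = min(s[name] for s in samples if s["target"])
        passed = sum(1 for s in samples
                     if not s["target"] and s[name] >= threshold)
        if best is None or passed < best[0]:
            best = (passed, name, threshold)
    return best[1], best[2]


def train_cascade(samples, features, n_stages):
    """Fit each stage only on the samples that survived all earlier stages."""
    stages, survivors, counts = [], list(samples), [len(samples)]
    for _ in range(n_stages):
        name, threshold = fit_stage(survivors, features)
        stages.append((name, threshold))
        survivors = [s for s in survivors if s[name] >= threshold]
        counts.append(len(survivors))
    return stages, survivors, counts


# HYPOTHETICAL training locations with two made-up feature scores each.
samples = [
    {"id": "tank_a", "target": True,  "f1": 0.90, "f2": 0.80},
    {"id": "tank_b", "target": True,  "f1": 0.70, "f2": 0.90},
    {"id": "sky",    "target": False, "f1": 0.10, "f2": 0.90},
    {"id": "grass",  "target": False, "f1": 0.20, "f2": 0.10},
    {"id": "trunk",  "target": False, "f1": 0.80, "f2": 0.30},
    {"id": "bush",   "target": False, "f1": 0.75, "f2": 0.85},
    {"id": "road",   "target": False, "f1": 0.30, "f2": 0.95},
]

stages, survivors, counts = train_cascade(samples, ["f1", "f2"], n_stages=2)
print(stages)                          # feature and threshold chosen at each stage
print(counts)                          # number of surviving locations per stage
print([s["id"] for s in survivors])    # locations still 'suspected' at the end
```

In this toy run the second stage selects a different feature from the first, because among the survivors of stage 1 that feature separates targets from non-targets better than it did on the full set. This mirrors, in a very reduced form, the rationale for developing specialized features at each stage.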

FIGURE 2


Results


Figure  3  shows  the  outcomes  of  the  first  two  stages  of  vehicle  detection  performed  on  one  of  the 
dataset images. This image was not used in training SVM1, SINBAD, or SVM2, but was
reserved  for  testing  the  performance  of  those  algorithms  after  their  training.  The  local  windows 
shown  in  the  bottom  two  panels  indicate  image  locations  at  which  SVMs  signaled  the  presence  of 
a  vehicle. 

As shown in the third panel, in stage 1 SVM1 correctly signaled the location of a tank in the
image. However, it also signaled 155 other locations that did not contain any vehicles (False
Positives). In stage 2, 153 of these 155 False Positive locations were correctly discarded by
SVM2 (see the bottom panel). SVM2 identified 8 locations in the image as possibly containing a
vehicle, 6 of them correctly (covering different parts of the same tank) and only 2 incorrectly.
Thus, Figure 3 demonstrates that developing specialized SINBAD features of suspected image
locations and training a new SVM on those features greatly reduces the number of False
Positives without reducing the ability to find the true targets.
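
The counts reported for this test image can be cross-checked and summarized with a few lines of arithmetic. The sketch below uses only the numbers given above; the variable names are ours.

```python
# Counts reported for the Figure 3 test image.
stage1_false_positives = 155
discarded_by_stage2 = 153
stage2_detections = 8
stage2_correct = 6

# Consistency check: the False Positives left after stage 2 should equal
# the detections in stage 2 that were not correct.
remaining_false_positives = stage1_false_positives - discarded_by_stage2
assert remaining_false_positives == stage2_detections - stage2_correct

fraction_removed = discarded_by_stage2 / stage1_false_positives
stage2_precision = stage2_correct / stage2_detections

print(f"False Positives removed by stage 2: {fraction_removed:.1%}")
print(f"Fraction of stage-2 detections on the tank: {stage2_precision:.0%}")
```

For this particular image the second stage removed almost 99% of the stage-1 False Positives, and three quarters of the locations it retained fell on the tank.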

Figure  4  shows  suspected  vehicle  locations  identified  by  SINBAD-SVM2  in  six  other  test  images. 
Each  panel  shows  (1)  a  part  of  the  original  image,  (2)  suspected  vehicle  locations  (small 
squares),  and  (3)  a  view  in  which  the  suspected  locations  and  their  surroundings  are  highlighted 
to  make  clearer  the  landscape  structures  that  were  mistaken  by  SVM2  for  a  vehicle.  Visual 
inspection  of  those  structures  shows  that  most  of  the  mistaken  structures  do  not  look  like  vehicles 
(e.g.,  tree  trunks  or  branches),  which  suggests  that  it  should  be  possible  for  the  next-stage 
SINBAD-SVM  to  learn  to  correctly  interpret  such  structures  as  non-targets. 

Interestingly, according to Toet et al. (1998), human observers had difficulties finding a vehicle in
database image 11 (top-left panel in Figure 4), with 18 out of 62 observers failing to find it. Image
11 was one of our test images and, in favorable contrast, SVM2 had no difficulty detecting this
target and, furthermore, did so without generating many False Positives. This superior performance
was repeated on database image 2 (top-right panel in Figure 4), which was also difficult for
human observers (16 out of 62 failed).


In conclusion, we believe that the already impressive vehicle-detection performance achieved so
far by the SVM1-SINBAD-SVM2 sequence can be raised considerably higher by incorporating
additional developments from future research into our set of image analysis procedures.